Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 1097813 |
| Missing cells | 190924 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 134.0 MiB |
| Average record size in memory | 128.0 B |
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 4 |
Alerts
| Alert | Type |
|---|---|
| time has a high cardinality: 45063 distinct values | High cardinality |
| gameId is highly correlated with team | High correlation |
| frameId is highly correlated with s and 1 other fields | High correlation |
| s is highly correlated with dis | High correlation |
| dis is highly correlated with s | High correlation |
| team is highly correlated with gameId | High correlation |
| nflId has 47731 (4.3%) missing values | Missing |
| jerseyNumber has 47731 (4.3%) missing values | Missing |
| o has 47731 (4.3%) missing values | Missing |
| dir has 47731 (4.3%) missing values | Missing |
| s has 68859 (6.3%) zeros | Zeros |
| a has 64207 (5.8%) zeros | Zeros |
| dis has 69209 (6.3%) zeros | Zeros |
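The missing-value figures above are mutually consistent, which can be checked with a little arithmetic. The sketch below uses only numbers copied from this report; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics and the missing-value alerts.
n_rows = 1_097_813
n_cols = 16
missing_per_column = {
    "nflId": 47_731,
    "jerseyNumber": 47_731,
    "o": 47_731,
    "dir": 47_731,
}

# Total missing cells is the sum over the four affected columns.
missing_cells = sum(missing_per_column.values())

# Percentage of all cells (rows x columns) that are missing.
missing_cells_pct = 100 * missing_cells / (n_rows * n_cols)

# Percentage of rows missing within a single affected column.
missing_per_column_pct = 100 * missing_per_column["nflId"] / n_rows

print(missing_cells, round(missing_cells_pct, 1), round(missing_per_column_pct, 1))
```

The four columns account for all 190924 missing cells. The same count, 47731, is also reported for the value `football` in the team abbreviations later in the report, which suggests (the report does not state it) that the rows with missing values describe the ball rather than a player.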
Reproduction
| Analysis started | 2022-11-02 14:54:51.616857 |
|---|---|
| Analysis finished | 2022-11-02 14:56:29.754358 |
| Duration | 1 minute and 38.14 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
gameId
Real number (ℝ≥0)
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2021100993 |
| Minimum | 2021100700 |
| Maximum | 2021101100 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 2021100700 |
|---|---|
| 5-th percentile | 2021100700 |
| Q1 | 2021101002 |
| median | 2021101007 |
| Q3 | 2021101011 |
| 95-th percentile | 2021101100 |
| Maximum | 2021101100 |
| Range | 400 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 80.74804157 |
|---|---|
| Coefficient of variation (CV) | 3.995250204 × 10⁻⁸ |
| Kurtosis | 8.455752025 |
| Mean | 2021100993 |
| Median Absolute Deviation (MAD) | 4 |
| Skewness | -2.924403412 |
| Sum | 2.218790945 × 10¹⁵ |
| Variance | 6520.246218 |
| Monotonicity | Increasing |
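As a quick check on how the descriptive statistics above relate to one another, the sketch below recomputes the coefficient of variation (standard deviation divided by mean) and the variance (standard deviation squared) from the rounded figures in the table. The variable names are illustrative and not part of the report.

```python
# Rounded figures copied from the descriptive statistics table above.
std_dev = 80.74804157
mean = 2021100993

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean

# Variance is the square of the standard deviation.
variance = std_dev ** 2

print(f"CV = {cv:.9e}")
print(f"variance = {variance:.6f}")
```

Both values agree with the table up to rounding. The very small CV reflects that the values are large identifiers spread over a narrow range, so it says little about the data itself.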
| Value | Count | Frequency (%) |
| 2021101013 | 86664 | 7.9% |
| 2021101008 | 85606 | 7.8% |
| 2021101100 | 78108 | 7.1% |
| 2021101000 | 76889 | 7.0% |
| 2021101007 | 74704 | 6.8% |
| 2021101002 | 73370 | 6.7% |
| 2021100700 | 70748 | 6.4% |
| 2021101001 | 70472 | 6.4% |
| 2021101009 | 67229 | 6.1% |
| 2021101012 | 66493 | 6.1% |
| Other values (6) | 347530 |
| Value | Count | Frequency (%) |
| 2021100700 | 70748 | |
| 2021101000 | 76889 | |
| 2021101001 | 70472 | |
| 2021101002 | 73370 | |
| 2021101003 | 51612 | |
| 2021101004 | 61686 | |
| 2021101005 | 65895 | |
| 2021101006 | 55453 | |
| 2021101007 | 74704 | |
| 2021101008 | 85606 |
| Value | Count | Frequency (%) |
| 2021101100 | 78108 | |
| 2021101013 | 86664 | |
| 2021101012 | 66493 | |
| 2021101011 | 62813 | |
| 2021101010 | 50071 | |
| 2021101009 | 67229 | |
| 2021101008 | 85606 | |
| 2021101007 | 74704 | |
| 2021101006 | 55453 | |
| 2021101005 | 65895 |
playId
Real number (ℝ≥0)
| Distinct | 978 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2143.464059 |
| Minimum | 54 |
| Maximum | 4597 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 54 |
|---|---|
| 5-th percentile | 233 |
| Q1 | 1117 |
| median | 2176 |
| Q3 | 3157 |
| 95-th percentile | 3956 |
| Maximum | 4597 |
| Range | 4543 |
| Interquartile range (IQR) | 2040 |
Descriptive statistics
| Standard deviation | 1187.342934 |
|---|---|
| Coefficient of variation (CV) | 0.5539364792 |
| Kurtosis | -1.132267128 |
| Mean | 2143.464059 |
| Median Absolute Deviation (MAD) | 1019 |
| Skewness | -0.03467026475 |
| Sum | 2353122709 |
| Variance | 1409783.243 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2599 | 4048 | 0.4% |
| 85 | 3910 | 0.4% |
| 2514 | 3818 | 0.3% |
| 2253 | 3427 | 0.3% |
| 3095 | 3289 | 0.3% |
| 715 | 3289 | 0.3% |
| 1602 | 3266 | 0.3% |
| 3889 | 3128 | 0.3% |
| 2641 | 3105 | 0.3% |
| 2668 | 3059 | 0.3% |
| Other values (968) | 1063474 |
| Value | Count | Frequency (%) |
| 54 | 828 | 0.1% |
| 55 | 897 | 0.1% |
| 62 | 1633 | |
| 63 | 1242 | 0.1% |
| 77 | 2507 | |
| 83 | 782 | 0.1% |
| 85 | 3910 | |
| 86 | 1403 | 0.1% |
| 95 | 1081 | 0.1% |
| 97 | 805 | 0.1% |
| Value | Count | Frequency (%) |
| 4597 | 1403 | |
| 4575 | 1012 | |
| 4553 | 1058 | |
| 4496 | 828 | |
| 4451 | 667 | |
| 4427 | 920 | |
| 4403 | 851 | |
| 4401 | 874 | |
| 4371 | 667 | |
| 4355 | 828 |
nflId
Real number (ℝ≥0)
| Distinct | 1161 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 47731 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45721.46635 |
| Minimum | 25511 |
| Maximum | 53991 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 25511 |
|---|---|
| 5-th percentile | 37139 |
| Q1 | 42429 |
| median | 45395 |
| Q3 | 48220 |
| 95-th percentile | 53481 |
| Maximum | 53991 |
| Range | 28480 |
| Interquartile range (IQR) | 5791 |
Descriptive statistics
| Standard deviation | 5050.466699 |
|---|---|
| Coefficient of variation (CV) | 0.1104616081 |
| Kurtosis | -0.121562387 |
| Mean | 45721.46635 |
| Median Absolute Deviation (MAD) | 2952 |
| Skewness | -0.178522697 |
| Sum | 4.801128883 × 10¹⁰ |
| Variance | 25507213.88 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 43367 | 2671 | 0.2% |
| 53492 | 2671 | 0.2% |
| 52504 | 2671 | 0.2% |
| 40107 | 2671 | 0.2% |
| 40166 | 2671 | 0.2% |
| 53655 | 2671 | 0.2% |
| 46152 | 2671 | 0.2% |
| 46085 | 2671 | 0.2% |
| 44839 | 2671 | 0.2% |
| 44822 | 2671 | 0.2% |
| Other values (1151) | 1023372 | |
| (Missing) | 47731 | 4.3% |
| Value | Count | Frequency (%) |
| 25511 | 1431 | |
| 28963 | 858 | |
| 29550 | 893 | |
| 29851 | 1517 | |
| 30842 | 401 | < 0.1% |
| 30869 | 1298 | |
| 33084 | 1695 | |
| 33107 | 1778 | |
| 33130 | 520 | < 0.1% |
| 33131 | 1032 |
| Value | Count | Frequency (%) |
| 53991 | 128 | < 0.1% |
| 53957 | 932 | |
| 53953 | 1374 | |
| 53946 | 135 | < 0.1% |
| 53900 | 449 | < 0.1% |
| 53876 | 284 | < 0.1% |
| 53819 | 36 | < 0.1% |
| 53687 | 976 | |
| 53674 | 798 | |
| 53668 | 45 | < 0.1% |
frameId
Real number (ℝ≥0)
| Distinct | 124 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.83895162 |
| Minimum | 1 |
| Maximum | 124 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 22 |
| Q3 | 33 |
| 95-th percentile | 53 |
| Maximum | 124 |
| Range | 123 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 16.06365415 |
|---|---|
| Coefficient of variation (CV) | 0.6738406287 |
| Kurtosis | 2.064723907 |
| Mean | 23.83895162 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 1.040341615 |
| Sum | 26170711 |
| Variance | 258.0409847 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 25484 | 2.3% |
| 14 | 25484 | 2.3% |
| 24 | 25484 | 2.3% |
| 23 | 25484 | 2.3% |
| 22 | 25484 | 2.3% |
| 21 | 25484 | 2.3% |
| 20 | 25484 | 2.3% |
| 19 | 25484 | 2.3% |
| 18 | 25484 | 2.3% |
| 17 | 25484 | 2.3% |
| Other values (114) | 842973 |
| Value | Count | Frequency (%) |
| 1 | 25484 | |
| 2 | 25484 | |
| 3 | 25484 | |
| 4 | 25484 | |
| 5 | 25484 | |
| 6 | 25484 | |
| 7 | 25484 | |
| 8 | 25484 | |
| 9 | 25484 | |
| 10 | 25484 |
| Value | Count | Frequency (%) |
| 124 | 23 | < 0.1% |
| 123 | 46 | |
| 122 | 46 | |
| 121 | 46 | |
| 120 | 69 | |
| 119 | 69 | |
| 118 | 69 | |
| 117 | 69 | |
| 116 | 69 | |
| 115 | 69 |
time
Categorical
| Distinct | 45063 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.4 MiB |
| 2021-10-10T17:24:24.300 | 69 |
|---|---|
| 2021-10-10T18:52:51.200 | 69 |
| 2021-10-10T18:52:51.400 | 69 |
| 2021-10-10T18:52:51.500 | 69 |
| 2021-10-10T18:52:51.600 | 69 |
| Other values (45058) |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 25249699 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 3 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | 2021-10-08T00:23:33.900 |
|---|---|
| 2nd row | 2021-10-08T00:23:34.000 |
| 3rd row | 2021-10-08T00:23:34.100 |
| 4th row | 2021-10-08T00:23:34.200 |
| 5th row | 2021-10-08T00:23:34.300 |
Common Values
| Value | Count | Frequency (%) |
| 2021-10-10T17:24:24.300 | 69 | < 0.1% |
| 2021-10-10T18:52:51.200 | 69 | < 0.1% |
| 2021-10-10T18:52:51.400 | 69 | < 0.1% |
| 2021-10-10T18:52:51.500 | 69 | < 0.1% |
| 2021-10-10T18:52:51.600 | 69 | < 0.1% |
| 2021-10-10T19:56:43.100 | 69 | < 0.1% |
| 2021-10-10T19:17:14.200 | 69 | < 0.1% |
| 2021-10-10T19:17:14.100 | 69 | < 0.1% |
| 2021-10-10T19:17:14.000 | 69 | < 0.1% |
| 2021-10-10T19:17:13.900 | 69 | < 0.1% |
| Other values (45053) | 1097123 |
[Figure: Length histogram]
| Value | Count | Frequency (%) |
| 2021-10-10t17:24:24.300 | 69 | < 0.1% |
| 2021-10-10t18:15:54.000 | 69 | < 0.1% |
| 2021-10-10t18:10:04.500 | 69 | < 0.1% |
| 2021-10-10t17:44:14.600 | 69 | < 0.1% |
| 2021-10-10t17:44:14.700 | 69 | < 0.1% |
| 2021-10-10t17:44:14.800 | 69 | < 0.1% |
| 2021-10-10t17:44:14.900 | 69 | < 0.1% |
| 2021-10-10t17:44:15.000 | 69 | < 0.1% |
| 2021-10-10t19:17:16.300 | 69 | < 0.1% |
| 2021-10-10t19:17:16.200 | 69 | < 0.1% |
| Other values (45053) | 1097123 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 6429738 | |
| 1 | 4713919 | |
| 2 | 3426195 | |
| - | 2195626 | 8.7% |
| : | 2195626 | 8.7% |
| T | 1097813 | 4.3% |
| . | 1097813 | 4.3% |
| 3 | 755877 | 3.0% |
| 4 | 733425 | 2.9% |
| 5 | 682457 | 2.7% |
| Other values (4) | 1921210 | 7.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 18662821 | |
| Other Punctuation | 3293439 | 13.0% |
| Dash Punctuation | 2195626 | 8.7% |
| Uppercase Letter | 1097813 | 4.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 6429738 | |
| 1 | 4713919 | |
| 2 | 3426195 | |
| 3 | 755877 | 4.1% |
| 4 | 733425 | 3.9% |
| 5 | 682457 | 3.7% |
| 8 | 549449 | 2.9% |
| 9 | 507448 | 2.7% |
| 7 | 495324 | 2.7% |
| 6 | 368989 | 2.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 2195626 | |
| . | 1097813 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2195626 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 1097813 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 24151886 | |
| Latin | 1097813 | 4.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 6429738 | |
| 1 | 4713919 | |
| 2 | 3426195 | |
| - | 2195626 | 9.1% |
| : | 2195626 | 9.1% |
| . | 1097813 | 4.5% |
| 3 | 755877 | 3.1% |
| 4 | 733425 | 3.0% |
| 5 | 682457 | 2.8% |
| 8 | 549449 | 2.3% |
| Other values (3) | 1371761 | 5.7% |
Latin
| Value | Count | Frequency (%) |
| T | 1097813 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 25249699 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 6429738 | |
| 1 | 4713919 | |
| 2 | 3426195 | |
| - | 2195626 | 8.7% |
| : | 2195626 | 8.7% |
| T | 1097813 | 4.3% |
| . | 1097813 | 4.3% |
| 3 | 755877 | 3.0% |
| 4 | 733425 | 2.9% |
| 5 | 682457 | 2.7% |
| Other values (4) | 1921210 | 7.6% |
jerseyNumber
Real number (ℝ≥0)
| Distinct | 98 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 47731 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 50.15997798 |
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 23 |
| median | 52 |
| Q3 | 76 |
| 95-th percentile | 96 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 53 |
Descriptive statistics
| Standard deviation | 29.82252554 |
|---|---|
| Coefficient of variation (CV) | 0.5945482184 |
| Kurtosis | -1.317059502 |
| Mean | 50.15997798 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.02384140107 |
| Sum | 52672090 |
| Variance | 889.3830298 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23 | 21810 | 2.0% |
| 24 | 20766 | 1.9% |
| 97 | 20291 | 1.8% |
| 21 | 20005 | 1.8% |
| 11 | 18360 | 1.7% |
| 55 | 17503 | 1.6% |
| 72 | 16670 | 1.5% |
| 91 | 16208 | 1.5% |
| 99 | 16088 | 1.5% |
| 2 | 16010 | 1.5% |
| Other values (88) | 866371 | |
| (Missing) | 47731 | 4.3% |
| Value | Count | Frequency (%) |
| 1 | 14121 | |
| 2 | 16010 | |
| 3 | 5823 | 0.5% |
| 4 | 13474 | |
| 5 | 9312 | |
| 6 | 7360 | |
| 7 | 6763 | |
| 8 | 10217 | |
| 9 | 7222 | |
| 10 | 10604 |
| Value | Count | Frequency (%) |
| 99 | 16088 | |
| 98 | 15556 | |
| 97 | 20291 | |
| 96 | 10903 | |
| 95 | 8368 | |
| 94 | 15665 | |
| 93 | 10761 | |
| 92 | 5185 | 0.5% |
| 91 | 16208 | |
| 90 | 14426 |
team
Categorical
| Distinct | 33 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.4 MiB |
| football | 47731 |
|---|---|
| KC | 41448 |
| BUF | 41448 |
| WAS | 40942 |
| NO | 40942 |
| Other values (28) |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 2.975350082 |
| Min length | 2 |
Characters and Unicode
| Total characters | 3266378 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | LA |
|---|---|
| 2nd row | LA |
| 3rd row | LA |
| 4th row | LA |
| 5th row | LA |
Common Values
| Value | Count | Frequency (%) |
| football | 47731 | 4.3% |
| KC | 41448 | 3.8% |
| BUF | 41448 | 3.8% |
| WAS | 40942 | 3.7% |
| NO | 40942 | 3.7% |
| BAL | 37356 | 3.4% |
| IND | 37356 | 3.4% |
| ATL | 36773 | 3.3% |
| NYJ | 36773 | 3.3% |
| MIA | 35728 | 3.3% |
| Other values (23) | 701316 |
[Figure: Length histogram]
| Value | Count | Frequency (%) |
| football | 47731 | 4.3% |
| kc | 41448 | 3.8% |
| buf | 41448 | 3.8% |
| was | 40942 | 3.7% |
| no | 40942 | 3.7% |
| bal | 37356 | 3.4% |
| ind | 37356 | 3.4% |
| atl | 36773 | 3.3% |
| nyj | 36773 | 3.3% |
| mia | 35728 | 3.3% |
| Other values (23) | 701316 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 375672 | 11.5% |
| N | 294184 | 9.0% |
| I | 253902 | 7.8% |
| L | 228019 | 7.0% |
| C | 198495 | 6.1% |
| E | 178211 | 5.5% |
| T | 160039 | 4.9% |
| B | 149622 | 4.6% |
| D | 127193 | 3.9% |
| S | 104819 | 3.2% |
| Other values (20) | 1196222 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2884530 | |
| Lowercase Letter | 381848 | 11.7% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 375672 | |
| N | 294184 | 10.2% |
| I | 253902 | 8.8% |
| L | 228019 | 7.9% |
| C | 198495 | 6.9% |
| E | 178211 | 6.2% |
| T | 160039 | 5.5% |
| B | 149622 | 5.2% |
| D | 127193 | 4.4% |
| S | 104819 | 3.6% |
| Other values (14) | 814374 |
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 95462 | |
| o | 95462 | |
| f | 47731 | |
| a | 47731 | |
| b | 47731 | |
| t | 47731 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3266378 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 375672 | 11.5% |
| N | 294184 | 9.0% |
| I | 253902 | 7.8% |
| L | 228019 | 7.0% |
| C | 198495 | 6.1% |
| E | 178211 | 5.5% |
| T | 160039 | 4.9% |
| B | 149622 | 4.6% |
| D | 127193 | 3.9% |
| S | 104819 | 3.2% |
| Other values (20) | 1196222 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3266378 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 375672 | 11.5% |
| N | 294184 | 9.0% |
| I | 253902 | 7.8% |
| L | 228019 | 7.0% |
| C | 198495 | 6.1% |
| E | 178211 | 5.5% |
| T | 160039 | 4.9% |
| B | 149622 | 4.6% |
| D | 127193 | 3.9% |
| S | 104819 | 3.2% |
| Other values (20) | 1196222 |
playDirection
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.4 MiB |
| left | |
|---|---|
| right |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.489095137 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4928187 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | right |
|---|---|
| 2nd row | right |
| 3rd row | right |
| 4th row | right |
| 5th row | right |
Common Values
| Value | Count | Frequency (%) |
| left | 560878 | |
| right | 536935 |
[Figures: Length histogram, Category Frequency Plot]
| Value | Count | Frequency (%) |
| left | 560878 | |
| right | 536935 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 1097813 | |
| l | 560878 | |
| e | 560878 | |
| f | 560878 | |
| r | 536935 | |
| i | 536935 | |
| g | 536935 | |
| h | 536935 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4928187 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 1097813 | |
| l | 560878 | |
| e | 560878 | |
| f | 560878 | |
| r | 536935 | |
| i | 536935 | |
| g | 536935 | |
| h | 536935 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4928187 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 1097813 | |
| l | 560878 | |
| e | 560878 | |
| f | 560878 | |
| r | 536935 | |
| i | 536935 | |
| g | 536935 | |
| h | 536935 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4928187 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 1097813 | |
| l | 560878 | |
| e | 560878 | |
| f | 560878 | |
| r | 536935 | |
| i | 536935 | |
| g | 536935 | |
| h | 536935 |
x
Real number (ℝ)
| Distinct | 11786 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.77382293 |
| Minimum | -1.72 |
| Maximum | 119.57 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 15 |
| Negative (%) | < 0.1% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | -1.72 |
|---|---|
| 5-th percentile | 20.33 |
| Q1 | 40.91 |
| median | 59.38 |
| Q3 | 78.42 |
| 95-th percentile | 100.41 |
| Maximum | 119.57 |
| Range | 121.29 |
| Interquartile range (IQR) | 37.51 |
Descriptive statistics
| Standard deviation | 24.28968786 |
|---|---|
| Coefficient of variation (CV) | 0.4063599527 |
| Kurtosis | -0.7464431436 |
| Mean | 59.77382293 |
| Median Absolute Deviation (MAD) | 18.74 |
| Skewness | 0.05042027758 |
| Sum | 65620479.87 |
| Variance | 589.9889364 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 34.67 | 208 | < 0.1% |
| 66.24 | 203 | < 0.1% |
| 41.63 | 202 | < 0.1% |
| 43.37 | 198 | < 0.1% |
| 63.67 | 195 | < 0.1% |
| 40.6 | 194 | < 0.1% |
| 48.42 | 194 | < 0.1% |
| 45.59 | 193 | < 0.1% |
| 44.57 | 193 | < 0.1% |
| 54.65 | 191 | < 0.1% |
| Other values (11776) | 1095842 |
| Value | Count | Frequency (%) |
| -1.72 | 1 | |
| -1.7 | 1 | |
| -1.69 | 1 | |
| -1.66 | 1 | |
| -1.59 | 1 | |
| -1.5 | 1 | |
| -1.38 | 1 | |
| -1.23 | 1 | |
| -1.07 | 1 | |
| -0.88 | 1 |
| Value | Count | Frequency (%) |
| 119.57 | 1 | |
| 119.54 | 1 | |
| 119.5 | 1 | |
| 119.48 | 1 | |
| 119.4 | 1 | |
| 119.35 | 1 | |
| 119.29 | 1 | |
| 119.26 | 1 | |
| 119.17 | 1 | |
| 119.08 | 1 |
y
Real number (ℝ)
| Distinct | 5424 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.78455619 |
| Minimum | -2.02 |
| Maximum | 56.59 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 23 |
| Negative (%) | < 0.1% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | -2.02 |
|---|---|
| 5-th percentile | 11.71 |
| Q1 | 22 |
| median | 26.8 |
| Q3 | 31.63 |
| 95-th percentile | 41.73 |
| Maximum | 56.59 |
| Range | 58.61 |
| Interquartile range (IQR) | 9.63 |
Descriptive statistics
| Standard deviation | 8.312595357 |
|---|---|
| Coefficient of variation (CV) | 0.3103503114 |
| Kurtosis | 0.3263039226 |
| Mean | 26.78455619 |
| Median Absolute Deviation (MAD) | 4.81 |
| Skewness | -0.01627168227 |
| Sum | 29404433.99 |
| Variance | 69.09924156 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 29.75 | 1105 | 0.1% |
| 23.75 | 1101 | 0.1% |
| 29.81 | 1089 | 0.1% |
| 29.83 | 1088 | 0.1% |
| 23.74 | 1082 | 0.1% |
| 29.8 | 1076 | 0.1% |
| 23.82 | 1071 | 0.1% |
| 23.7 | 1067 | 0.1% |
| 23.73 | 1054 | 0.1% |
| 29.79 | 1053 | 0.1% |
| Other values (5414) | 1087027 |
| Value | Count | Frequency (%) |
| -2.02 | 1 | |
| -2.01 | 1 | |
| -2 | 1 | |
| -1.97 | 1 | |
| -1.95 | 1 | |
| -1.91 | 1 | |
| -1.87 | 1 | |
| -1.81 | 1 | |
| -1.78 | 1 | |
| -1.68 | 1 |
| Value | Count | Frequency (%) |
| 56.59 | 1 | |
| 56.51 | 1 | |
| 56.41 | 1 | |
| 56.31 | 1 | |
| 56.2 | 1 | |
| 56.08 | 1 | |
| 55.99 | 1 | |
| 55.95 | 1 | |
| 55.82 | 1 | |
| 55.75 | 1 |
s
Real number (ℝ≥0)
| Distinct | 2157 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.598595617 |
| Minimum | 0 |
| Maximum | 29.34 |
| Zeros | 68859 |
| Zeros (%) | 6.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.76 |
| median | 2.14 |
| Q3 | 3.83 |
| 95-th percentile | 6.83 |
| Maximum | 29.34 |
| Range | 29.34 |
| Interquartile range (IQR) | 3.07 |
Descriptive statistics
| Standard deviation | 2.410041143 |
|---|---|
| Coefficient of variation (CV) | 0.9274398555 |
| Kurtosis | 14.17000886 |
| Mean | 2.598595617 |
| Median Absolute Deviation (MAD) | 1.5 |
| Skewness | 2.341926989 |
| Sum | 2852772.05 |
| Variance | 5.808298313 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 68859 | 6.3% |
| 0.01 | 16548 | 1.5% |
| 0.02 | 9333 | 0.9% |
| 0.03 | 7047 | 0.6% |
| 0.04 | 5777 | 0.5% |
| 0.05 | 5107 | 0.5% |
| 0.06 | 4728 | 0.4% |
| 0.07 | 4231 | 0.4% |
| 0.08 | 3909 | 0.4% |
| 0.09 | 3874 | 0.4% |
| Other values (2147) | 968400 |
| Value | Count | Frequency (%) |
| 0 | 68859 | |
| 0.01 | 16548 | 1.5% |
| 0.02 | 9333 | 0.9% |
| 0.03 | 7047 | 0.6% |
| 0.04 | 5777 | 0.5% |
| 0.05 | 5107 | 0.5% |
| 0.06 | 4728 | 0.4% |
| 0.07 | 4231 | 0.4% |
| 0.08 | 3909 | 0.4% |
| 0.09 | 3874 | 0.4% |
| Value | Count | Frequency (%) |
| 29.34 | 1 | |
| 29 | 1 | |
| 28.48 | 1 | |
| 27.39 | 1 | |
| 27.28 | 1 | |
| 27.24 | 1 | |
| 27.18 | 1 | |
| 27.12 | 1 | |
| 27.07 | 1 | |
| 27.05 | 1 |
a
Real number (ℝ≥0)
| Distinct | 1587 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.782808083 |
| Minimum | 0 |
| Maximum | 27.26 |
| Zeros | 64207 |
| Zeros (%) | 5.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.71 |
| median | 1.53 |
| Q3 | 2.57 |
| 95-th percentile | 4.44 |
| Maximum | 27.26 |
| Range | 27.26 |
| Interquartile range (IQR) | 1.86 |
Descriptive statistics
| Standard deviation | 1.435226143 |
|---|---|
| Coefficient of variation (CV) | 0.8050368163 |
| Kurtosis | 6.592413885 |
| Mean | 1.782808083 |
| Median Absolute Deviation (MAD) | 0.91 |
| Skewness | 1.472769507 |
| Sum | 1957189.89 |
| Variance | 2.059874082 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 64207 | 5.8% |
| 0.01 | 12817 | 1.2% |
| 0.02 | 7246 | 0.7% |
| 0.03 | 5485 | 0.5% |
| 0.04 | 4643 | 0.4% |
| 0.05 | 3820 | 0.3% |
| 1.33 | 3525 | 0.3% |
| 1.2 | 3521 | 0.3% |
| 1.04 | 3506 | 0.3% |
| 1.12 | 3503 | 0.3% |
| Other values (1577) | 985540 |
| Value | Count | Frequency (%) |
| 0 | 64207 | |
| 0.01 | 12817 | 1.2% |
| 0.02 | 7246 | 0.7% |
| 0.03 | 5485 | 0.5% |
| 0.04 | 4643 | 0.4% |
| 0.05 | 3820 | 0.3% |
| 0.06 | 3358 | 0.3% |
| 0.07 | 3045 | 0.3% |
| 0.08 | 2780 | 0.3% |
| 0.09 | 2620 | 0.2% |
| Value | Count | Frequency (%) |
| 27.26 | 1 | |
| 26.43 | 1 | |
| 26.25 | 1 | |
| 26.14 | 1 | |
| 25.48 | 1 | |
| 24.28 | 1 | |
| 24.15 | 1 | |
| 23.74 | 1 | |
| 23.44 | 1 | |
| 23.34 | 1 |
dis
Real number (ℝ≥0)
| Distinct | 538 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2631515294 |
| Minimum | 0 |
| Maximum | 7.1 |
| Zeros | 69209 |
| Zeros (%) | 6.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.08 |
| median | 0.22 |
| Q3 | 0.38 |
| 95-th percentile | 0.68 |
| Maximum | 7.1 |
| Range | 7.1 |
| Interquartile range (IQR) | 0.3 |
Descriptive statistics
| Standard deviation | 0.2570572949 |
|---|---|
| Coefficient of variation (CV) | 0.9768413484 |
| Kurtosis | 47.47513465 |
| Mean | 0.2631515294 |
| Median Absolute Deviation (MAD) | 0.15 |
| Skewness | 4.09396423 |
| Sum | 288891.17 |
| Variance | 0.06607845284 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 69209 | 6.3% |
| 0.01 | 57954 | 5.3% |
| 0.02 | 33452 | 3.0% |
| 0.03 | 25844 | 2.4% |
| 0.04 | 23066 | 2.1% |
| 0.05 | 21323 | 1.9% |
| 0.21 | 20481 | 1.9% |
| 0.06 | 20397 | 1.9% |
| 0.2 | 20209 | 1.8% |
| 0.19 | 20182 | 1.8% |
| Other values (528) | 785696 |
| Value | Count | Frequency (%) |
| 0 | 69209 | |
| 0.01 | 57954 | |
| 0.02 | 33452 | |
| 0.03 | 25844 | 2.4% |
| 0.04 | 23066 | 2.1% |
| 0.05 | 21323 | 1.9% |
| 0.06 | 20397 | 1.9% |
| 0.07 | 20044 | 1.8% |
| 0.08 | 19495 | 1.8% |
| 0.09 | 19363 | 1.8% |
| Value | Count | Frequency (%) |
| 7.1 | 1 | |
| 6.91 | 1 | |
| 6.79 | 1 | |
| 6.63 | 1 | |
| 6.53 | 1 | |
| 6.49 | 1 | |
| 6.43 | 1 | |
| 6.17 | 1 | |
| 6.01 | 1 | |
| 6 | 2 |
o
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 47731 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.8714058 |
| Minimum | 0 |
| Maximum | 360 |
| Zeros | 17 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 31.08 |
| Q1 | 90.33 |
| median | 179.93 |
| Q3 | 269.79 |
| 95-th percentile | 331.08 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 179.46 |
Descriptive statistics
| Standard deviation | 99.23953905 |
|---|---|
| Coefficient of variation (CV) | 0.5486745604 |
| Kurtosis | -1.356028944 |
| Mean | 180.8714058 |
| Median Absolute Deviation (MAD) | 89.73 |
| Skewness | -0.0003806155838 |
| Sum | 189929807.5 |
| Variance | 9848.486111 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 501 | < 0.1% |
| 264.52 | 111 | < 0.1% |
| 265.75 | 110 | < 0.1% |
| 269.43 | 103 | < 0.1% |
| 92.08 | 103 | < 0.1% |
| 265.08 | 103 | < 0.1% |
| 267.47 | 101 | < 0.1% |
| 89.25 | 101 | < 0.1% |
| 85.9 | 100 | < 0.1% |
| 269.41 | 100 | < 0.1% |
| Other values (35991) | 1048649 | |
| (Missing) | 47731 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 17 | |
| 0.01 | 12 | |
| 0.02 | 14 | |
| 0.03 | 12 | |
| 0.04 | 17 | |
| 0.05 | 16 | |
| 0.06 | 15 | |
| 0.07 | 14 | |
| 0.08 | 27 | |
| 0.09 | 14 |
| Value | Count | Frequency (%) |
| 360 | 14 | |
| 359.99 | 28 | |
| 359.98 | 11 | < 0.1% |
| 359.97 | 28 | |
| 359.96 | 19 | |
| 359.95 | 17 | |
| 359.94 | 15 | |
| 359.93 | 26 | |
| 359.92 | 18 | |
| 359.91 | 15 |
dir
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 47731 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.8815035 |
| Minimum | 0 |
| Maximum | 360 |
| Zeros | 28 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 23.47 |
| Q1 | 90.93 |
| median | 180.04 |
| Q3 | 271.18 |
| 95-th percentile | 337.68 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 180.25 |
Descriptive statistics
| Standard deviation | 101.4632771 |
|---|---|
| Coefficient of variation (CV) | 0.5609378245 |
| Kurtosis | -1.284141938 |
| Mean | 180.8815035 |
| Median Absolute Deviation (MAD) | 90.13 |
| Skewness | -0.004000071251 |
| Sum | 189940411 |
| Variance | 10294.79659 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 273.22 | 72 | < 0.1% |
| 262.9 | 72 | < 0.1% |
| 88.26 | 72 | < 0.1% |
| 272.41 | 71 | < 0.1% |
| 95.6 | 71 | < 0.1% |
| 96.1 | 70 | < 0.1% |
| 271.96 | 70 | < 0.1% |
| 274.51 | 70 | < 0.1% |
| 261.32 | 69 | < 0.1% |
| 272.61 | 69 | < 0.1% |
| Other values (35991) | 1049376 | |
| (Missing) | 47731 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 28 | |
| 0.01 | 30 | |
| 0.02 | 22 | |
| 0.03 | 27 | |
| 0.04 | 25 | |
| 0.05 | 27 | |
| 0.06 | 28 | |
| 0.07 | 16 | |
| 0.08 | 30 | |
| 0.09 | 33 |
| Value | Count | Frequency (%) |
| 360 | 10 | < 0.1% |
| 359.99 | 20 | |
| 359.98 | 30 | |
| 359.97 | 21 | |
| 359.96 | 18 | |
| 359.95 | 20 | |
| 359.94 | 26 | |
| 359.93 | 30 | |
| 359.92 | 26 | |
| 359.91 | 26 |
event
Categorical
| Distinct | 19 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.4 MiB |
| None | |
|---|---|
| ball_snap | 25415 |
| pass_forward | 22747 |
| autoevent_ballsnap | 11569 |
| autoevent_passforward | 11178 |
| Other values (14) | 12949 |
Length
| Max length | 25 |
|---|---|
| Median length | 4 |
| Mean length | 4.676771909 |
| Min length | 3 |
Characters and Unicode
| Total characters | 5134221 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | None |
|---|---|
| 2nd row | None |
| 3rd row | None |
| 4th row | None |
| 5th row | None |
Common Values
| Value | Count | Frequency (%) |
| None | 1013955 | |
| ball_snap | 25415 | 2.3% |
| pass_forward | 22747 | 2.1% |
| autoevent_ballsnap | 11569 | 1.1% |
| autoevent_passforward | 11178 | 1.0% |
| play_action | 6049 | 0.6% |
| run | 1403 | 0.1% |
| qb_sack | 1219 | 0.1% |
| pass_arrived | 1058 | 0.1% |
| shift | 690 | 0.1% |
| Other values (9) | 2530 | 0.2% |
[Figure: Length histogram]
| Value | Count | Frequency (%) |
| none | 1013955 | |
| ball_snap | 25415 | 2.3% |
| pass_forward | 22747 | 2.1% |
| autoevent_ballsnap | 11569 | 1.1% |
| autoevent_passforward | 11178 | 1.0% |
| play_action | 6049 | 0.6% |
| run | 1403 | 0.1% |
| qb_sack | 1219 | 0.1% |
| pass_arrived | 1058 | 0.1% |
| shift | 690 | 0.1% |
| Other values (9) | 2530 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 1084818 | |
| o | 1079114 | |
| e | 1064164 | |
| N | 1013955 | |
| a | 182827 | 3.6% |
| s | 112056 | 2.2% |
| _ | 82593 | 1.6% |
| p | 80891 | 1.6% |
| l | 80408 | 1.6% |
| r | 72979 | 1.4% |
| Other values (15) | 280416 | 5.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4037673 | |
| Uppercase Letter | 1013955 | 19.7% |
| Connector Punctuation | 82593 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 1084818 | |
| o | 1079114 | |
| e | 1064164 | |
| a | 182827 | 4.5% |
| s | 112056 | 2.8% |
| p | 80891 | 2.0% |
| l | 80408 | 2.0% |
| r | 72979 | 1.8% |
| t | 57224 | 1.4% |
| b | 38341 | 0.9% |
| Other values (13) | 184851 | 4.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 1013955 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 82593 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5051628 | |
| Common | 82593 | 1.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 1084818 | |
| o | 1079114 | |
| e | 1064164 | |
| N | 1013955 | |
| a | 182827 | 3.6% |
| s | 112056 | 2.2% |
| p | 80891 | 1.6% |
| l | 80408 | 1.6% |
| r | 72979 | 1.4% |
| t | 57224 | 1.1% |
| Other values (14) | 223192 | 4.4% |
Common
| Value | Count | Frequency (%) |
| _ | 82593 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5134221 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| n | 1084818 | 21.1% |
| o | 1079114 | 21.0% |
| e | 1064164 | 20.7% |
| N | 1013955 | 19.7% |
| a | 182827 | 3.6% |
| s | 112056 | 2.2% |
| _ | 82593 | 1.6% |
| p | 80891 | 1.6% |
| l | 80408 | 1.6% |
| r | 72979 | 1.4% |
| Other values (15) | 280416 | 5.5% |
Auto
The auto setting is an easily interpretable pairwise column metric that uses the following mapping (vartype-vartype : method): categorical-categorical : Cramér's V, numerical-categorical : Cramér's V (using a discretized numerical column), numerical-numerical : Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
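The calculation described above can be sketched in plain Python. This is an illustrative implementation, not the code the report generator runs; the helper names `average_ranks`, `pearson_r` and `spearman_rho` and the sample values are assumptions made for the example.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is monotonic but not linear
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)  # ranks agree perfectly
r = pearson_r(x, y)       # lower, because the relationship is not linear
print(round(rho, 6), round(r, 4))
```

Because ρ only looks at ranks, any strictly increasing transformation of either variable leaves it unchanged, whereas Pearson's r drops below 1 for the same data.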
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
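A small sketch of this formula and of the invariance property, assuming a hypothetical helper `pearson_r`. The sample values are the `s` and `dis` entries of the first five rows shown under "Last rows" in this report; the shift and scale constants are arbitrary choices for the illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# s and dis for rows 1097803-1097807 of the "Last rows" table
s = [1.58, 1.80, 1.97, 2.12, 2.23]
dis = [0.14, 0.17, 0.19, 0.20, 0.22]

r_original = pearson_r(s, dis)

# Shift and rescale each variable separately (positive scale factors)
s_shifted = [3.0 * v + 10.0 for v in s]
dis_shifted = [0.5 * v - 2.0 for v in dis]
r_transformed = pearson_r(s_shifted, dis_shifted)

print(round(r_original, 4), round(r_transformed, 4))
```

Both calls return the same coefficient up to floating-point error. A negative scale factor on one variable would flip the sign of r while keeping its magnitude.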
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
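The pair-counting definition above can be sketched directly. The function name `kendall_tau` and the sample values are assumptions for illustration. The sketch implements the plain formula as stated (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently apply a tie adjustment instead, so results can differ when ties are present.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total pairs, with ties counted as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables move in the same direction
            concordant += 1
        elif product < 0:    # the variables move in opposite directions
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau(x, y)  # 5 concordant, 1 discordant, 6 pairs
print(round(tau, 4))
```

The double loop is quadratic in the number of observations, which is fine for a demonstration but would be slow on a table the size of the one profiled here.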
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure, proposed by Bergsma in 2013, that can be found here.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
First rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021100700 | 95 | 30869.0 | 1 | 2021-10-08T00:23:33.900 | 77.0 | LA | right | 22.93 | 33.04 | 0.02 | 0.02 | 0.00 | 102.26 | 129.51 | None |
| 1 | 2021100700 | 95 | 30869.0 | 2 | 2021-10-08T00:23:34.000 | 77.0 | LA | right | 22.93 | 33.04 | 0.01 | 0.02 | 0.00 | 101.64 | 128.68 | None |
| 2 | 2021100700 | 95 | 30869.0 | 3 | 2021-10-08T00:23:34.100 | 77.0 | LA | right | 22.92 | 33.05 | 0.01 | 0.01 | 0.01 | 100.73 | 127.26 | None |
| 3 | 2021100700 | 95 | 30869.0 | 4 | 2021-10-08T00:23:34.200 | 77.0 | LA | right | 22.92 | 33.05 | 0.01 | 0.01 | 0.01 | 100.73 | 130.90 | None |
| 4 | 2021100700 | 95 | 30869.0 | 5 | 2021-10-08T00:23:34.300 | 77.0 | LA | right | 22.91 | 33.05 | 0.01 | 0.01 | 0.01 | 99.55 | 134.16 | None |
| 5 | 2021100700 | 95 | 30869.0 | 6 | 2021-10-08T00:23:34.400 | 77.0 | LA | right | 22.91 | 33.05 | 0.01 | 0.01 | 0.00 | 99.55 | 134.37 | ball_snap |
| 6 | 2021100700 | 95 | 30869.0 | 7 | 2021-10-08T00:23:34.500 | 77.0 | LA | right | 22.90 | 33.05 | 0.01 | 0.01 | 0.01 | 99.55 | 138.57 | autoevent_ballsnap |
| 7 | 2021100700 | 95 | 30869.0 | 8 | 2021-10-08T00:23:34.600 | 77.0 | LA | right | 22.89 | 33.06 | 0.00 | 0.00 | 0.02 | 98.54 | 144.90 | None |
| 8 | 2021100700 | 95 | 30869.0 | 9 | 2021-10-08T00:23:34.700 | 77.0 | LA | right | 22.88 | 33.07 | 0.00 | 0.00 | 0.01 | 97.83 | 143.93 | None |
| 9 | 2021100700 | 95 | 30869.0 | 10 | 2021-10-08T00:23:34.800 | 77.0 | LA | right | 22.87 | 33.08 | 0.00 | 0.00 | 0.01 | 96.94 | 150.31 | None |
Last rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1097803 | 2021101100 | 4401 | NaN | 29 | 2021-10-12T03:30:05.900 | NaN | football | right | 97.06 | 21.91 | 1.58 | 2.81 | 0.14 | NaN | NaN | None |
| 1097804 | 2021101100 | 4401 | NaN | 30 | 2021-10-12T03:30:06.000 | NaN | football | right | 97.13 | 21.75 | 1.80 | 2.51 | 0.17 | NaN | NaN | None |
| 1097805 | 2021101100 | 4401 | NaN | 31 | 2021-10-12T03:30:06.100 | NaN | football | right | 97.23 | 21.59 | 1.97 | 2.16 | 0.19 | NaN | NaN | None |
| 1097806 | 2021101100 | 4401 | NaN | 32 | 2021-10-12T03:30:06.200 | NaN | football | right | 97.35 | 21.43 | 2.12 | 1.67 | 0.20 | NaN | NaN | None |
| 1097807 | 2021101100 | 4401 | NaN | 33 | 2021-10-12T03:30:06.300 | NaN | football | right | 97.48 | 21.26 | 2.23 | 1.24 | 0.22 | NaN | NaN | pass_forward |
| 1097808 | 2021101100 | 4401 | NaN | 34 | 2021-10-12T03:30:06.400 | NaN | football | right | 100.80 | 19.42 | 21.36 | 0.17 | 3.80 | NaN | NaN | None |
| 1097809 | 2021101100 | 4401 | NaN | 35 | 2021-10-12T03:30:06.500 | NaN | football | right | 102.61 | 18.29 | 21.29 | 1.13 | 2.13 | NaN | NaN | None |
| 1097810 | 2021101100 | 4401 | NaN | 36 | 2021-10-12T03:30:06.600 | NaN | football | right | 104.41 | 17.17 | 21.13 | 2.06 | 2.12 | NaN | NaN | None |
| 1097811 | 2021101100 | 4401 | NaN | 37 | 2021-10-12T03:30:06.700 | NaN | football | right | 106.20 | 16.06 | 20.88 | 2.94 | 2.10 | NaN | NaN | None |
| 1097812 | 2021101100 | 4401 | NaN | 38 | 2021-10-12T03:30:06.800 | NaN | football | right | 107.96 | 14.97 | 20.55 | 3.71 | 2.07 | NaN | NaN | None |